Confirming the attainment of maximal oxygen uptake within special and clinical groups: A systematic review and meta-analysis of cardiopulmonary exercise test and verification phase protocols

Background and aim A plateau in oxygen uptake (V˙O2) during an incremental cardiopulmonary exercise test (CPET) to volitional exhaustion appears less likely to occur in special and clinical populations. Secondary maximal oxygen uptake (V˙O2max) criteria have been shown to commonly underestimate the actual V˙O2max. The verification phase protocol might determine the occurrence of ‘true’ V˙O2max in these populations. The primary aim of the current study was to systematically review and provide a meta-analysis on the suitability of the verification phase for confirming ‘true’ V˙O2max in special and clinical groups. Secondary aims were to explore the applicability of the verification phase according to specific participant characteristics and investigate which test protocols and procedures minimise the differences between the highest V˙O2 values attained in the CPET and verification phase. Methods Electronic databases (PubMed, Web of Science, SPORTDiscus, Scopus, and EMBASE) were searched using specific search strategies and relevant data were extracted from primary studies. Studies meeting inclusion criteria were systematically reviewed. Meta-analysis techniques were applied to quantify weighted mean differences (standard deviations) in peak V˙O2 from a CPET and a verification phase within study groups using random-effects models. Subgroup analyses investigated the differences in V˙O2max according to individual characteristics and test protocols. The methodological quality of the included primary studies was assessed using a modified Downs and Black checklist to obtain a level of evidence. Participant-level V˙O2 data were analysed according to the threshold criteria reported by the studies or the inherent measurement error of the metabolic analysers and displayed as Bland-Altman plots. Results Forty-three studies were included in the systematic review, whilst 30 presented quantitative information for meta-analysis. 
Within the 30 studies, the highest mean V˙O2 values attained in the CPET and verification phase protocols were similar (mean difference = -0.00 [95% confidence intervals, CI = -0.03 to 0.03] L·min-1, p = 0.87; level of evidence, LoE: strong). The specific clinical groups with sufficient primary studies to be meta-analysed showed a similar V˙O2max between the CPET and verification phase (p > 0.05, LoE: limited to strong). Across all 30 studies, V˙O2max was not affected by differences in test protocols (p > 0.05; LoE: moderate to strong). Only 23 (53.5%) of the 43 reviewed studies reported how many participants achieved a lower, equal, or higher V˙O2 value in the verification phase versus the CPET or reported or supplied participant-level V˙O2 data for this information to be obtained. The percentage of participants that achieved a lower, equal, or higher V˙O2 value in the verification phase was highly variable across studies (e.g. the percentage that achieved a higher V˙O2 in the verification phase ranged from 0% to 88.9%). Conclusion Group-level verification phase data appear useful for confirming a specific CPET protocol likely elicited V˙O2max, or a reproducible V˙O2peak, for a given special or clinical group. Participant-level data might be useful for confirming whether specific participants have likely elicited V˙O2max, or a reproducible V˙O2peak, however, more research reporting participant-level data is required before evidence-based guidelines can be given. Trial registration PROSPERO (CRD42021247658) https://www.crd.york.ac.uk/prospero.


Introduction
Maximal oxygen uptake (V˙O2max) represents the upper physiological limit of utilising oxygen to produce energy during volitional exercise to exhaustion [1]. The original concept emerged in the 1920s in the seminal works of Hill and colleagues [2,3]. These authors described this phenomenon as the "ceiling" of oxygen uptake (V˙O2) during a discontinuous step-incremented exercise test, beyond which no additional increase in V˙O2 is observed despite an increase in work rate. In special groups such as apparently healthy children and older adults, and clinical groups such as people with chronic respiratory and metabolic conditions, V˙O2max testing is increasingly recommended and typically determined using an incremental cardiopulmonary exercise test (CPET) with concurrent recording of electrocardiography, blood pressure, and phase is conceptually like the discontinuous V˙O2max tests that were used in developing V˙O2 plateau criteria, but has the advantage of requiring only one visit to the laboratory [28]. The verification phase has emerged as a potentially valid alternative for establishing whether a 'true' V˙O2max has been attained [28,29]. A recent meta-analysis of studies recruiting apparently healthy participants reported that, unlike traditional V˙O2max criteria, the verification phase is not affected by the V˙O2max test protocol or procedures, or by participant characteristics such as sex and level of cardiorespiratory fitness [29]. Considering that the V˙O2 plateau is less likely to occur in unfit participants, and that secondary criteria commonly underestimate V˙O2max in special and clinical groups [30-32], the verification phase might be particularly useful to establish the occurrence of 'true' V˙O2max or, when V˙O2max has not been elicited, the highest attainable V˙O2 in these groups. Notably, participant and test protocol characteristics may not allow the attainment of V˙O2max during the CPET or verification phase. This has
direct clinical applications in establishing functional capacity and the effectiveness of exercise training, and the subsequent evaluation of health risks in clinical populations. However, no systematic review and meta-analysis has investigated the utility of the verification phase across diverse special and clinical groups according to different CPET and verification phase protocols and procedures. The effect of these factors on the utility of the verification phase in special and clinical groups is therefore unclear.
The primary aim of the current study was to systematically review and provide a meta-analysis on the suitability of the verification phase for confirming 'true' V˙O2max in special and clinical groups. Secondary aims were to explore the applicability of the verification phase according to specific participant characteristics and investigate which test protocols and procedures minimise the differences between the highest V˙O2 values attained in the CPET and verification phase.

Protocol and registration
The systematic review and meta-analysis were performed and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [33]. A full PRISMA checklist is shown in S1 Checklist. The protocol was registered at https://www.crd.york.ac.uk/prospero (CRD42021247658).

Search strategy
MEDLINE (accessed through PubMed), Web of Science, SPORTDiscus, Scopus, and EMBASE were searched for peer-reviewed literature. The search strategy included terms relating to cardiopulmonary exercise test, verification phase, V˙O2max test, and oxygen uptake, using a combination of entry terms and synonyms. Medical subject heading (MeSH) descriptors were also included in the PubMed search. All studies published from the inception of the databases until the search date (14th October 2023) were sought. All references from electronic search results were imported into Endnote bibliographic software (version X9, Bld 12062, Clarivate) and duplicates were removed. The electronic searches were re-run before the final analysis and further studies were retrieved for inclusion. A list containing the full search strategy for each database is available (see S1 Fig). Backward searching for additional relevant studies was conducted by scrutinising the reference lists of the full-text articles of the initially included studies. Forward searching for additional relevant studies was conducted within electronic databases by scrutinising studies that had cited the initially included studies since their publication.
Eligibility criteria for inclusion were studies published in English or Portuguese involving 1) Population: individuals affected by any disease, disability or clinical condition, apparently healthy children and adolescents (< 18 years of age), and older adults (≥ 65 years of age) according to American College of Sports Medicine and American Heart Association definitions [34]; 2) Type of study: any research design that included at least one CPET and at least one verification phase carried out on a cycle ergometer, while walking, running or propelling a wheelchair on a treadmill, or on a ski-, wheelchair- or arm-ergometer; and 3) Outcome: V˙O2max determined using expired gas analysis during the maximal CPET (control) and verification phase (comparator). Studies were excluded if: 1)
they involved secondary analysis of previously included studies; and 2) they investigated older adults, and it could not be ascertained whether any of the participants were below 65 years old. Two blinded researchers performed the searches and screening procedures.

Study selection
Studies were screened for inclusion using a three-step approach: 1) titles and abstracts were initially screened for potentially eligible articles; 2) the full texts of all potentially eligible articles were obtained; and 3) all inclusion and exclusion criteria were applied to full-text articles for the final decision on eligibility. Two of the authors independently determined whether each study met the eligibility criteria. Disagreements were resolved by discussion.

Data extraction and management
The following data were systematically extracted to a Microsoft Excel (2013) spreadsheet: 1) total sample size; 2) characteristics of study participants (special population or clinical condition, sex, age, body mass index, and cardiorespiratory fitness); 3) exercise modality; 4) type of CPET protocol; 5) verification phase protocol and procedures (work rate, type of recovery, timing in relation to whether performed on the same day or a different day to the CPET, and whether or not a verification phase threshold criterion was used); and 6) outcome measures (mean ± standard deviation [SD] test duration and absolute V˙O2max for the CPET and verification phase). Authors of the original articles were contacted to request data when these were not reported. Non-responses from authors were followed up with a second email.

Quality assessment
The quality of the included studies was independently assessed by two of the authors using a modified version of the Downs and Black checklist [35] (see S2 Fig). Modified versions of this checklist have been employed in reviews in the sport and exercise sciences, which also mainly used cross-sectional studies for their data retrieval [36,37]. The original checklist comprises 27 items, which are distributed over five sub-scales: reporting (items 1-10), external validity (items 11-13), bias (items 14-20), confounding (items 21-26), and power (item 27) [35]. The Downs and Black checklist was originally designed for intervention studies. Since the present review does not focus on intervention studies, items 8, 9, 12-16, 19, and 22-26 were excluded, and the remaining 14 items were retained. Furthermore, an additional item was added on whether the included studies provided information on the sampling method. The term "patient" was replaced by "participant", the term "principal confounders" by "participant characteristics", and, where applicable, the term "treatment" was interpreted in the context of "testing" [36,37]. All items, except items 4 and 6, were rated as "Yes" (1 point), "No" (0 points), or "Unknown" (0 points).
For item 4, both the CPET and verification phase needed to be described in sufficient detail, i.e., 1) the duration and magnitude of increments in the CPET; 2) total duration of the CPET; 3) whether the verification phase used a sub- or supra-peak work rate/speed relative to the peak attained during the CPET; 4) total duration of the verification phase; 5) exercise modality used for the CPET and verification phase; and 6) type and duration of recovery between the CPET and verification phase. Two points were given if all six details were described, and one point was given if four or five of the six were described. For item 6, both simple outcome data for the major findings of the study and values for V˙O2 with its unit of measurement needed to be sufficiently described for this item to be scored a "Yes" with 2 points. Providing only the percentage change or absolute difference in V˙O2 between the CPET to exhaustion and verification phase was not sufficient. If only one of the two was sufficiently described, the item was rated as a "Yes" with 1 point. A third author helped reach consensus where there were disagreements between the primary reviewers. Quality assessment cut-off points were decided on retrospectively and studies were regarded as low (0-8 points), moderate (9-14 points), or good (15-17 points) methodological quality, based on the total score achieved on the modified Downs and Black checklist. The level of evidence (LoE) for the results of the main and subgroup analyses was categorised from very limited to strong by combining the quality scores of each of the studies included (Table 1). A figure showing the results of the quality assessments was constructed using the RStudio programme (R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2013).
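The scoring rules above can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' procedure: the function names are hypothetical, and the only values taken from the text are the point rules for items 4 and 6 and the 0-8, 9-14 and 15-17 cut-offs.

```python
def score_item4(details_described: int) -> int:
    """Item 4: 2 points if all six protocol details are described,
    1 point if four or five are described, otherwise 0."""
    if not 0 <= details_described <= 6:
        raise ValueError("details_described must be between 0 and 6")
    if details_described == 6:
        return 2
    if details_described >= 4:
        return 1
    return 0


def score_item6(outcome_data_described: bool, vo2_with_unit_described: bool) -> int:
    """Item 6: 2 points if both elements are described, 1 point if only one is."""
    return int(outcome_data_described) + int(vo2_with_unit_described)


def classify_quality(total_score: int) -> str:
    """Map a total score (0-17) on the modified checklist to a quality band."""
    if not 0 <= total_score <= 17:
        raise ValueError("total_score must be between 0 and 17")
    if total_score <= 8:
        return "low"
    if total_score <= 14:
        return "moderate"
    return "good"
```

The maximum of 17 points follows from the 15 retained items, 13 of which score at most 1 point and two of which (items 4 and 6) score at most 2 points.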

Statistical analysis
All meta-analyses were performed using the Review Manager (RevMan) software version 5.3 (Copenhagen, The Nordic Cochrane Centre, The Cochrane Collaboration, 2014). Data are presented as the mean ± SD, following Cochrane Handbook guidelines [39]. These guidelines state that to perform a meta-analysis of continuous data, authors should utilise the mean value, standard deviation, and number of participants for whom the outcome was measured in each intervention group or test protocol. The outcome was the mean difference (95% confidence interval [CI]) between the CPET and verification phase for the highest absolute V˙O2 value in L·min-1. Given that absolute V˙O2 values are continuous data, the weighted mean difference method was used for combining study effect size estimates. With the weighted mean difference method, the pooled effect estimate represents a weighted mean of all included study group comparisons. The weighting assigned to each individual study group (i.e., the comparison of CPET and verification phase results) in the analysis was inversely proportional to the variance of the absolute V˙O2. This method typically assigns more weight in the meta-analysis to studies with higher precision (inverse variance) and larger sample sizes. The weighted mean differences were calculated using random-effects models given the study group differences in participants' characteristics, CPET modalities and protocols, types of recovery, and verification phase protocols. These differences in both participant and protocol characteristics mean that the effect size could vary from study to study. A standardised mean difference method, in which the pooled effect estimate represents a weighted standardised mean of all included study group comparisons, was included as a sensitivity check given the differences in absolute V˙O2 observed across different clinical groups.
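As an illustration of the pooling described above, the following minimal Python sketch computes an inverse-variance weighted mean difference under a random-effects model. It is not the RevMan implementation. The DerSimonian-Laird estimator of between-study variance, the treatment of the two tests as independent groups when computing the variance of each difference, the function name, and the input layout are all assumptions made for illustration.

```python
import math


def pooled_mean_difference(study_groups):
    """Random-effects pooled mean difference (verification phase minus CPET).

    study_groups: list of tuples
        (mean_ver, sd_ver, n_ver, mean_cpet, sd_cpet, n_cpet), in L/min.
    Returns (pooled, (ci_low, ci_high), q, tau2).
    """
    diffs, variances = [], []
    for m_ver, sd_ver, n_ver, m_cpet, sd_cpet, n_cpet in study_groups:
        diffs.append(m_ver - m_cpet)
        # Assumption: variance of a difference between two independent means.
        variances.append(sd_ver ** 2 / n_ver + sd_cpet ** 2 / n_cpet)

    # Fixed-effect (inverse-variance) weights and estimate, needed for Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * d for wi, d in zip(w, diffs)) / sum(w)
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, diffs))

    # DerSimonian-Laird between-study variance, truncated at zero.
    df = len(diffs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * d for wi, d in zip(w_re, diffs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), q, tau2


# Hypothetical study groups, invented purely to show the call.
example = [
    (2.00, 0.50, 10, 2.00, 0.50, 10),
    (3.00, 0.60, 20, 3.10, 0.60, 20),
]
pooled, ci, q, tau2 = pooled_mean_difference(example)
```

Study groups with smaller variances (larger samples, smaller SDs) receive larger weights, which matches the weighting behaviour described in the text.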
Heterogeneity of net group changes in absolute V˙O2max was examined using the Q statistic. Cochran's Q statistic is computed by summing the squared deviations of each trial's estimate from the overall meta-analytic estimate and weighting each trial's contribution in the same manner as in the meta-analysis. The p-values were obtained by comparing the statistic with a χ² distribution with k-1 degrees of freedom (where k is the number of trials). A p-value of < 0.10 was adopted since the Q statistic tends to suffer from low statistical power [40]. The formal Q statistic was used in conjunction with other methods for assessing heterogeneity. The I² statistic was used to measure the extent of inconsistency among the results of the primary study groups, interpreted approximately as the proportion of total variation in point estimates due to heterogeneity rather than sampling error. Effect sizes with a corresponding I² value of ≤ 50% were considered to have low heterogeneity. Potential publication bias or studies with outlier data were assessed using a funnel plot.
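For reference, the two heterogeneity statistics can be written in their standard textbook form. The symbols are assumptions introduced here rather than notation from the review: \hat{\theta}_i is the mean difference for study group i, v_i its variance, w_i its inverse-variance weight, \hat{\theta} the weighted overall estimate, and k the number of study groups.

```latex
Q = \sum_{i=1}^{k} w_i \left(\hat{\theta}_i - \hat{\theta}\right)^2,
\qquad w_i = \frac{1}{v_i},
\qquad \hat{\theta} = \frac{\sum_{i=1}^{k} w_i \hat{\theta}_i}{\sum_{i=1}^{k} w_i}

I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%
```

Under the null hypothesis of homogeneity, Q is compared with a \chi^2 distribution with k - 1 degrees of freedom, and I^2 expresses the excess of Q over its degrees of freedom as a percentage of Q.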
Subgroup analyses were defined a priori to investigate the magnitude of differences between CPET and verification phases due to variations in group characteristics, exercise modality, CPET protocol design, or how the verification phase was performed. The following subgroups based on medical conditions and participant characteristics were considered: paediatric (obese and non-obese under 18 yrs), geriatric (≥ 65 yrs), wheelchair (elite wheelchair athletes and individuals in a wheelchair without spinal cord injury or spina bifida), respiratory (cystic fibrosis, chronic asthma/airway disorders, and bronchiectasis), metabolic (overweight and obese adults with and without metabolic syndrome or hypertension), oncological, and cardiological. Forest plots were constructed to display values at the 95% confidence level. Effect sizes were calculated by subtracting the highest mean values for absolute V˙O2 observed in the CPET from the verification phase values, based on grouping studies with selected verification phase characteristics for work rate (i.e., sub vs. supra peak work rate) and type of recovery between the CPET and verification phase (i.e., active vs. passive). The studies were also classified according to whether a criterion threshold for V˙O2max was used for the verification phase (i.e., yes vs. no), involving an absolute or relative difference in V˙O2 between the CPET and verification phase, typically characterised as a percentage difference between tests. In addition to the meta-analytical approach, a participant-level analysis was conducted with data directly reported in the reviewed studies or supplied by the corresponding authors. The analysis of the differences between the highest V˙O2 values elicited in the CPET and verification phase was based on the threshold criteria utilised by the studies or the error of measurement of the metabolic analysers that were used.
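A participant-level comparison of this kind can be sketched as follows. This is a hypothetical illustration rather than the procedure applied in the review: the function names and the default 3% threshold are assumptions, and in practice the threshold would be the criterion reported by each study or the measurement error of the metabolic analyser used.

```python
def classify_participant(vo2_cpet, vo2_ver, threshold_pct=3.0):
    """Label the verification phase V'O2 as 'lower', 'equal' or 'higher'
    relative to the CPET value, given a percentage threshold."""
    if vo2_cpet <= 0:
        raise ValueError("vo2_cpet must be positive")
    diff_pct = (vo2_ver - vo2_cpet) / vo2_cpet * 100.0
    if diff_pct > threshold_pct:
        return "higher"
    if diff_pct < -threshold_pct:
        return "lower"
    return "equal"


def summarise(pairs, threshold_pct=3.0):
    """Percentage of participants in each category.

    pairs: non-empty list of (vo2_cpet, vo2_ver) tuples in the same unit.
    """
    counts = {"lower": 0, "equal": 0, "higher": 0}
    for vo2_cpet, vo2_ver in pairs:
        counts[classify_participant(vo2_cpet, vo2_ver, threshold_pct)] += 1
    total = len(pairs)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical participants (L/min), invented for illustration only.
result = summarise([(2.00, 2.10), (2.00, 2.02), (2.00, 1.90), (3.00, 3.01)])
```

With these invented values, one participant falls in the higher category, one in the lower category and two in the equal category.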

Results
The literature search identified 2108 potential studies, of which 2082 were obtained from electronic databases and 26 from manual searches through a wider inspection of reference lists and citations of these articles. Forty-three articles published between 1993 and 2023 met the eligibility criteria and were included in the systematic review, whilst 30 presented relevant quantitative information to be considered for meta-analysis (see Fig 1).
Tables 2 and 3 show, respectively, the sample characteristics for the reviewed studies and the exercise testing protocols used to measure the V˙O2max in all participants. Twenty-two studies (51%) used continuous step-incremented protocols, 19 (44%) used ramp-incremented protocols, one used a discontinuous protocol (2%), and one included both continuous and discontinuous step-incremented protocols (2%). Twenty-three studies (53%) used one or more traditional V˙O2max criteria, of which 17 (39%) used a V˙O2 plateau, 16 (37%) used a heart rate plateau or criteria based on the age-predicted maximal heart rate, 17 (39%) used maximal respiratory exchange ratio, 5 (12%) used post-exercise blood lactate concentration, and 6 (14%) used ratings of perceived exertion cut-off values.
Regarding respiratory expired gas analysis procedures, smoothing of pulmonary gas exchange data is required during exercise testing for the determination of V˙O2max, especially for data collected from participants on a breath-by-breath basis. The most common approach was based on time averages. Thirty-two studies (74%) reported using time averages of between 10 and 30 s, two (5%) used moving time averages, two (5%) applied 12-breath rolling averages, and three (7%) did not describe which V˙O2 data processing method was applied. Amongst more traditional expired gas collection techniques, four studies (9%) used Douglas bag collections of between 30 and 60 s. No study addressed the effect of different V˙O2 sampling intervals on the difference between the peak V˙O2 values attained in the CPET and verification phase.
Regarding the type of recovery between CPET termination and the start of the verification phase, 10 studies (23%) used active recovery, 11 (26%) used passive recovery, 8 (19%) adopted a combination of passive and active recovery, and 11 (26%) did not report the type of recovery. The verification phase was carried out on a different day from the CPET in three studies (7%). When the verification phase was performed on the same day as the CPET, the recovery period varied from 4 to 25 min. The most common recovery period was 10 min, used in ten studies (23%). Two articles (5%) did not state the duration of the recovery period. Twenty-five studies (58%) used square-wave verification phase protocols (i.e., the work rate was immediately increased to the target sub or supra peak work rate), whereas 17 (40%) used multistage verification phase protocols characterised by an initial warm-up stage. Only one study (2%) did not describe the verification phase protocol. The peak work rate used in the verification phase protocols ranged from 80% to 110% of the peak work rate attained in the CPET across studies. Most studies applied a supra peak work rate based
on the peak work rate achieved in the CPET (n = 33; 77%). Three studies (7%) applied both sub and supra peak work rates within the same study, two (5%) used peak work rate, four (9%) applied only a sub peak work rate, and one (2%) did not describe the verification phase work rate. The mean times to exhaustion for the CPET and verification phase were 568 s (SD, 143 s) and 127 s (SD, 57 s), respectively.
Twenty-two studies (51%) employed threshold criteria to analyse differences between the highest V˙O2 attained in the CPET and verification phase. These criteria were frequently based on the intra-subject coefficient of variation acquired from the researchers' laboratories or from published literature. Threshold criteria included a difference in V˙O2 (L·min-1) of < 2%, < 3%, < 5% and < 9%, and an absolute difference between measured and predicted V˙O2 from linear extrapolations of V˙O2/work rate responses during the CPET (such as < 50% of the "expected" increase). Other cut-off points included a typical error of 0.06 L·min-1 or 2.1 mL·kg-1·min-1, or a comparison of the highest group mean V˙O2 value obtained in the CPET versus the verification phase that resulted in p > 0.05.

Methodological quality of the included studies
There was 86% agreement between the two authors in ranking the items initially, and full agreement was reached upon discussion with a third author. Two studies (5%) were rated as having low methodological quality, 37 (86%) as moderate, and four (9%) as good (Fig 2).
The highest mean V˙O2 was not different between the CPET and verification phase (mean difference = -0.00 [95% CI = -0.03 to 0.03] L·min-1, p = 0.87; LoE: strong). Given the potential for large heterogeneity in mean V˙O2max values across the different populations included in the review, a sensitivity check was performed using the alternative statistical approach of standardised mean differences, and the overall and subgroup meta-analysis findings were robust to it. This method resulted in a different weighting pattern for the individual study group differences between the CPET and verification phase V˙O2max values (data not presented). However, the overall effect was unchanged (standardised mean difference = -0.01 [95% CI = -0.09 to 0.08] L·min-1, p = 0.87, LoE: strong). Pooled data for V˙O2max following the CPET and verification phase showed no significant heterogeneity among all the studies (see Fig 3). Except for one of the included studies judged to be an outlier [44], the meta-analysed studies were judged to have a low risk of bias as shown by the funnel plot (Fig 4).

Group-level quantitative data synthesis: Differences between the highest V˙O2 attained in the CPET and verification phase
There was no statistically significant difference between CPET and verification-derived V˙O2max for the paediatric group that included seven studies with 18 experimental conditions (mean difference = -0.01 [95% CI = -0.05 to 0.03] L·min-1, p = 0.68, LoE: moderate). Additionally, the subgroup analysis of obese and non-obese paediatric participants, composed of two studies with four experimental conditions, revealed no statistically significant difference (mean difference = -0.11 [95% CI = -0.24 to 0.01] L·min-1, p = 0.08, LoE: moderate). The wheelchair group consisted of three studies with eight experimental conditions, and no statistically significant difference was observed between the CPET and verification phase (mean difference = 0.11 [95% CI = -0.02 to 0.23] L·min-1, p = 0.10, LoE: strong). The chronic respiratory group consisted of six studies with 11 experimental conditions, which demonstrated no statistically significant difference between the CPET and verification phase (mean difference = 0.07 [95% CI = -0.02 to 0.17] L·min-1, p = 0.14, LoE: strong). The 15 girls from the study by Robben et al.
[69] were removed from the meta-analysis, since the reported SD was an extreme outlier (e.g., the study would have been weighted at 40.7% in the final meta-analysis). The subgroup of paediatric patients with cystic fibrosis included four studies and seven experimental conditions (mean difference = 0.06 [95% CI = -0.13 to 0.25] L·min-1, p = 0.55, LoE: strong). The geriatric group incorporated four studies with seven experimental conditions, and results showed no statistically significant difference between the CPET and verification phase (mean difference = -0.08 [95% CI = -0.21 to 0.05] L·min-1, p = 0.20, LoE: moderate). Finally, the metabolic group, including individuals with overweight or obesity (with and without metabolic disease), comprised six studies and 13 experimental conditions. These studies used ramp-based cycle ergometry and demonstrated no statistically significant difference between the CPET and the verification phase (mean difference = -0.06 [95% CI = -0.13 to -0.01] L·min-1, p = 0.09).

Participant-level analysis of the highest V˙O2 values attained in the CPET and verification phase
Only 23 (53.5%) of the 43 reviewed studies reported how many participants achieved a lower, equal, or higher V˙O2 value in the verification phase versus the CPET or supplied participant-level V˙O2 data from which this information could be obtained. Table 5 shows the percentages of participants that achieved a lower, equal, or higher V˙O2 value in the verification phase versus the CPET for each study where this information was available. Bland-Altman plots were used to display participant-level differences between the highest V˙O2 values obtained in the CPET and verification phase for the seven studies where these data were available [31, 44-47, 56, 60].

Main findings
To the best of our knowledge, this is the first systematic review and meta-analysis to investigate the utility of the verification phase for confirming V˙O2max in special and clinical groups. The major findings were: a) overall, the highest V˙O2 attained in the CPET was not statistically significantly different from that obtained in the verification phase across all primary studies included; b) subgroup analysis showed there were no statistically significant differences in the highest V˙O2 attained in the CPET and verification phase for specific groups; c) across all studies, the difference between the highest V˙O2 attained in the CPET and verification phase was not affected by test protocol characteristics; d) participant-level verification phase data might be useful for providing evidence of whether the CPET likely elicited V˙O2max in a given person; e) a V˙O2 plateau in the CPET does not always confirm V˙O2max since the verification phase V˙O2 is sometimes higher in the presence of a V˙O2 plateau; and f) the included studies did not report any adverse events associated with the verification phase.
verification phase. Hence, the authors recommended a verification phase to confirm 'true' V˙O2max in people with spinal cord injury. Similarly, 24 trained wheelchair athletes (tetraplegics, paraplegics and non-spinal cord-injured) had their V˙O2max confirmed through a verification phase performed at the same treadmill speed, but at a gradient 0.3% to 0.6% higher than that of the CPET.

Metabolic group
Our final group sub-analysis within the metabolic group revealed a trend for an attenuated V˙O2 in the CPET compared to the verification phase in individuals with overweight or obesity, with or without metabolic disease or prehypertension. Notably, all six studies in the subgroup analysis used cycle ergometry. In a trial with four verification phases performed at 80, 90, 100 and 105% WRmax, eight of nine men with obesity attained a higher V˙O2 in the verification phase [57]. Although not statistically significant, the highest V˙O2 during the verification phase at 90% WRmax was 0.24 L·min-1 higher than during the CPET, which is a 7% difference. Other data from studies with large samples of individuals who are sedentary and have obesity and metabolic disease indicate that the magnitude (3%-9%) and prevalence (40% of people with overweight/obesity) of the underestimation of V˙O2max during a CPET are high. Regardless, the results from the current meta-analysis and these primary studies indicate that the verification phase appears to be a robust method for confirming V˙O2max in individuals with overweight or obesity, with or without metabolic disease or prehypertension.

Verification phase characteristics
Regarding verification phase characteristics, Fig 5 illustrates the combined effects of work rate, recovery mode, use of a verification phase threshold, day of the test, and protocol duration. No significant results were found for the combined analysis, nor when each category was analysed separately. Consequently, a specific verification phase protocol cannot currently be recommended for any of the groups included in the present systematic review. These results agree with those found in apparently healthy adults, where the verification phase was not affected by test protocol characteristics [29]. Although V˙O2max was similar among different protocols, notably, no included study applied a sub WRpeak verification phase on a treadmill. It therefore remains unclear if relative exercise work rate affects the suitability of the verification phase for confirming V˙O2max on a treadmill in special and clinical groups. Regarding the mode of recovery (i.e., active or passive) between the CPET and verification phase, our meta-analysis demonstrated that the 20 mL·min-1 mean difference between the highest V˙O2 values observed in the CPET and verification phase was not statistically significant. Similar to a recent systematic review that investigated the utility of the verification phase in apparently healthy adults [29], we did not find any study that compared active and passive recoveries. In the present systematic review, the time between the CPET and verification phase ranged from 4 to 25 minutes in studies where these tests were performed on the same day. Although no single study compared the verification phase conducted on the same day versus a different day to the CPET, no statistically significant difference was found in the present review after combining the results. We therefore recommend either an active or passive recovery after the CPET, with no need for an additional visit on a separate day.
Considering the importance of a verification phase being individually analysed to identify those who confirmed their V˙O2max, we suggest the adoption of a threshold criterion to compare the differences between the CPET and verification phase. The threshold criterion value has commonly been based on the reproducibility of V˙O2max during the CPET and specific to the metabolic cart that was used, or an arbitrary 2-3% difference has been used. Alternatively, some studies have calculated the differences between measured and predicted V˙O2max [45-47, 61, 62]. However, we did not obtain significant differences between studies that did or did not apply a threshold criterion. We systematically recorded whether authors of primary studies commented specifically on participant-level data. In fact, 22 studies (51%) in the present systematic review discussed participant-level differences between the highest V˙O2 attained in the CPET and verification phase. We strongly suggest future researchers report participant-level data, since the absence of any significant mean differences in the highest V˙O2 attained in the CPET and verification phase may mask individuals who attain a practically significant higher V˙O2 in the verification phase.
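To make the idea of a threshold criterion concrete, the following minimal sketch classifies each participant's verification phase V˙O2 as lower, similar, or higher than the CPET value. The function name `classify_verification`, the default 3% relative threshold, and the example values are assumptions for illustration; a threshold based on the measurement error of a specific metabolic cart could be substituted.

```python
def classify_verification(cpet_vo2, vp_vo2, threshold_pct=3.0):
    """Classify the verification phase (VP) value relative to the CPET value.

    threshold_pct is an assumed relative threshold, expressed as a percentage
    of the CPET value. Differences within +/- threshold_pct count as "similar".
    """
    if cpet_vo2 <= 0:
        raise ValueError("cpet_vo2 must be positive")
    diff_pct = 100.0 * (vp_vo2 - cpet_vo2) / cpet_vo2
    if diff_pct > threshold_pct:
        return "higher"
    if diff_pct < -threshold_pct:
        return "lower"
    return "similar"


# Hypothetical participants as (CPET, VP) pairs in L/min; not study data.
participants = [(2.50, 2.52), (3.00, 3.20), (1.80, 1.70)]
labels = [classify_verification(cpet, vp) for cpet, vp in participants]
# labels == ["similar", "higher", "lower"]
```

Reporting the resulting counts per category alongside the group means would expose individuals whose verification phase value is meaningfully higher even when the mean difference is not significant.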
The combined effects of verification phase duration (i.e., short < 80 s, medium 81-120 s, and long > 120 s) resulted in no statistically significant difference between the highest V˙O2 observed in the CPET and verification phase. Similarly, a previous study did not find any significant correlations between different durations at 110% WRmax for the highest V˙O2 attained in the CPET and verification phase [77]. Furthermore, 12 paediatric participants with cystic fibrosis confirmed their V˙O2max and two showed 9% higher V˙O2max values during a verification phase performed at 110% WRmax (76 ± 22 s). A brief duration may be inadequate to allow sufficient time for oxygen uptake to achieve maximal values, especially in individuals having slow O2 kinetics such as those with metabolic or respiratory disease [81-83]. Verification phases shorter than 80 s might elicit V˙O2max; however, we recommend that future studies implement strategies to avoid inappropriately short verification phases and early termination, such as the adoption of multistage verification protocols that incorporate submaximal and supra-peak (in relation to the CPET WRmax) work rates.

Participant-level data
Whether the verification phase confirmed that V˙O2max was likely elicited during the CPET was highly variable across the 23 studies that either reported this information directly, or where participant-level data were available to obtain this information. For example, the verification phase elicited a higher V˙O2 than the CPET in 0% to 88.9% of participants across studies. This large variability between studies is likely due to differences in the CPET and verification phase protocols that were used for the various special and clinical groups involved.
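Participant-level agreement between the two tests can also be summarised with a Bland-Altman analysis, the approach named in the Methods. The sketch below computes the bias and 95% limits of agreement for paired values. The function name `bland_altman` and the example data are assumptions for illustration, and the conventional 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(cpet_values, vp_values):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Differences are computed as verification phase minus CPET, so a positive
    bias means the verification phase tended to give the higher value.
    """
    if len(cpet_values) != len(vp_values) or len(cpet_values) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [vp - cpet for cpet, vp in zip(cpet_values, vp_values)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired values in L/min; not data from any included study.
cpet = [2.0, 2.5, 3.0, 3.5]
vp = [2.1, 2.4, 3.1, 3.6]
bias, lower, upper = bland_altman(cpet, vp)
# bias is about 0.05 L/min, with limits of about -0.146 and 0.246 L/min
```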
The verification phase failed to confirm the highest V˙O2 in all participants in one study involving para ice hockey players [44], which included a CPET with rapidly incrementing work rates and a verification phase performed at 110% of the WRmax achieved in the CPET. A longer CPET duration with a slower ramp rate and/or a lower WR in the verification phase might be more effective in eliciting a higher V˙O2 when using upper-body exercise modalities in athletes with lower-limb impairments. In other studies where participants elicited a higher V˙O2 in the CPET, the percentage ranged from 3.6% to 35%. The study that observed 35% involved 22 apparently healthy older adults who performed verification phases at 85% and 110% WRmax [77]. The authors concluded that 85% WRmax was preferable for this population as it more likely confirmed V˙O2max. Four out of seven people after stroke achieved a V˙O2 that was > 3% higher in the verification phase versus the CPET (range 4.7% to 24.1%) [60]. Similar to the para ice hockey athletes [44], it is unclear whether the same limiting factors during the CPET also occurred during the verification phase, leading to submaximal V˙O2 values in both protocols. Twelve of 18 (66.7%) children elicited a higher V˙O2 in the verification phase compared to the CPET [45]. A notable finding is that five of the seven children who exhibited a V˙O2 plateau during the CPET elicited a higher V˙O2 during the verification phase. One study involving nine men with obesity compared verification phases performed at 80%, 90%, 100% and 105% of WRmax achieved in the CPET [57]. Eight of the men attained a higher V˙O2 during a verification phase, with the submaximal verification phase performed at 90% WRmax eliciting the highest V˙O2 values. These findings are similar to those observed in older adults in that a sub-peak verification phase [...]
Further critical discussion is also needed among researchers regarding how the verification phase should be interpreted when applied to participant-level data. For example, in the first edition of the Canadian Association of Sports Sciences guidelines for the physiological testing of high-performance athletes, Thoden et al. [86] suggested that the highest V˙O2 value elicited in either the CPET or verification phase should be regarded as V˙O2max. In the second edition of the guidelines, however, Thoden et al. [87] recommended that an increase in the highest V˙O2 elicited in the verification phase that is not more than 2% higher than that elicited in the CPET verifies that V˙O2max was elicited in the CPET. At the participant level, the verification phase can therefore be used to verify whether V˙O2max was likely elicited in the CPET or used simply as another opportunity for a person to elicit V˙O2max.
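The two interpretations can be contrasted in a short sketch. The function names and example values below are hypothetical and chosen only for illustration; the 2% tolerance follows the second-edition recommendation as described in this section.

```python
def vo2max_highest_of_both(cpet_vo2, vp_vo2):
    """First interpretation: the highest value from either test is V'O2max."""
    return max(cpet_vo2, vp_vo2)


def cpet_value_verified(cpet_vo2, vp_vo2, tolerance_pct=2.0):
    """Second interpretation: the CPET value is verified when the verification
    phase (VP) value is not more than tolerance_pct higher than the CPET value.
    """
    if cpet_vo2 <= 0:
        raise ValueError("cpet_vo2 must be positive")
    return vp_vo2 <= cpet_vo2 * (1.0 + tolerance_pct / 100.0)


# Hypothetical participant, values in L/min: the VP value is 5% above the CPET.
cpet_vo2, vp_vo2 = 3.00, 3.15
reported_vo2max = vo2max_highest_of_both(cpet_vo2, vp_vo2)  # 3.15
verified = cpet_value_verified(cpet_vo2, vp_vo2)            # False
```

For this hypothetical participant the first rule simply reports the higher verification phase value, whereas the second flags that the CPET value was not verified, which is the distinction drawn in the text.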

Conclusion
The main finding was that the mean highest V˙O2 was similar between the CPET and the associated verification phase. In other words, the verification phase confirmed that V˙O2max had been attained in the CPET. Moreover, a 10-to-15-minute recovery phase and short verification phase following a CPET appear to be safe, well-tolerated, and time-efficient for a diverse range of special and clinical groups. Unlike traditional V˙O2max criteria, the application of the verification phase was not affected by differences in test protocol or procedures, or participant characteristics, except perhaps for those with overweight or obesity. In those with overweight or obesity, the V˙O2 attained in the CPET was significantly lower than that obtained in the verification phase. For individuals with obesity and apparently healthy older adults, the selection of sub-peak work rates above critical power is desirable. For the remaining groups, it remains somewhat unclear which work rate is most suitable. For paediatrics, a verification phase of 100-105% WRmax conducted 15 min after the CPET on a cycle or treadmill has been useful. For the wheelchair group, we can only advise using a verification phase < 110% WRmax. Regarding individuals with chronic respiratory problems, a cycle ergometer is applicable for a verification phase of 3 min at 20 W, then 110% WRmax, performed after 5 min of active rest and 10 min of passive rest after the CPET. Older adults, including those with chronic heart failure, may perform a verification phase 5-10 min after the CPET, either on a cycle ergometer or treadmill. If cycling is chosen, a work rate equal to 85-95% WRpeak would optimise V˙O2max attainment. On a treadmill, 2 min at 50%, 1 min at 70%, and then one stage higher than WRpeak is recommended. Lastly, adults with obesity or metabolic conditions may perform a verification phase above critical power on a cycle ergometer up to 110% of the WRpeak attained in the CPET. Some researchers might decide not to conduct a verification phase where results from the present meta-analysis indicate that particular CPET test protocols and procedures for certain special and clinical groups appear to elicit 'true' V˙O2max values. However, there remains the issue of identifying whether a participant has likely elicited V˙O2max during a CPET or elicited a submaximal V˙O2 value due to early test termination. Considering the limitations associated with traditional V˙O2max criteria, the verification phase remains a reasonable time-efficient alternative in special and clinical populations; however, further research and critical debate are required before robust evidence-based guidelines can be provided.
Fig 1 shows the screening and selection phases.

Fig 2. Quality scores for the 43 included studies. Grey dots are for items scored as 'no', yellow dots are for items scored 'yes' (1), and blue dots are for items scored 'yes' (2). Black dots are added where authors retrospectively provided extra information that would have led to a higher quality appraisal score if included in the original publication. https://doi.org/10.1371/journal.pone.0299563.g002

Fig 4. Funnel plot assessment of publication bias for the highest V˙O2 (L·min⁻¹) attained in the cardiopulmonary exercise test (CPET) and verification phase. One outlier was identified, which may relate to methodological error within the verification phase protocol (see discussion section). https://doi.org/10.1371/journal.pone.0299563.g004
Fig 6 shows participant-

Fig 6. Participant-level differences between the highest V˙O2 values obtained in the cardiopulmonary exercise test (CPET) and verification phase for the seven studies where these data were available. https://doi.org/10.1371/journal.pone.0299563.g006

Table 4 shows comparisons between the highest V˙O2 values elicited in the CPET and verification phase for each study. Fig 3 displays the forest plot of effect sizes and 95% CIs for the highest V˙O2 values (30 studies) based on the random effects meta-analysis results. The highest V˙O2

Table 5. Frequencies and percentages for whether the verification phase (VP) elicited a lower, similar, or higher V˙O2 than the cardiopulmonary exercise test (CPET) for the 23 studies that mentioned or reported this information.
Participant-level data were supplied by the authors for the first seven studies in the table.

V˙O2 lower, similar, or higher in VP versus CPET?
a 2% threshold criterion used to decide whether the highest V˙O2 was similar in the CPET and verification phase, which was based on the inherent variability of the metabolic cart; b Participant-level data were provided but the missing frequencies in the table could not be calculated as an appropriate verification threshold criterion could not be identified; c Could not calculate the missing frequencies in the table as the authors used a verification threshold criterion based on the predicted versus measured V˙O2max in the CPET and verification phase; d Total study sample was 135 participants of which 114 performed both a CPET and verification phase. CSI = continuous step-incremented; DSI = discontinuous step-incremented.